A Study of Page Placement and Migration in Heterogeneous Flat-Addressable Memories
Abstract
The volume of data generated by research, commercial, industrial, communication, entertainment and other fields is growing exponentially. Analyzing such volumes of data in a reasonable amount of time requires faster and much larger main memories. In addition, these systems need to be as energy efficient as possible, since the energy requirements of most high-performance computing systems and data centers are becoming significant portions of their operational budgets. This has led to research into different memory technologies such as 3D-stacked DRAM, improvements to DDR, and non-volatile memories such as PCM and Flash. There have been some studies on how these memory technologies can be used to address the need for very large amounts of main memory at reasonable cost and energy budgets. Flat-addressable memories differ from the hierarchical view of the memory system: they are organized as a single (flat) physical address space with two or more types of memory devices, each with its own latency and bandwidth. In such memories, page placement and the migration (or swapping) of pages across the different memory devices require very careful analysis. Previous studies have explored simple migration policies in a system with 3D-DRAM and DDR4 as main memory. Our analysis shows that for some workloads, static page placement with no page migration outperforms policies that consider only page access counts in deciding which pages to migrate. Additionally, these policies may prove to be inefficient for memory systems that include PCM. In this paper we present and evaluate several intelligent and efficient policies for migration of physical pages across the memory technologies. We study our policies for two-level (3D-DRAM + DDR4; and 3D-DRAM + PCM) and three-level (3D-DRAM + DDR4 + PCM) memory systems. We present both performance improvements and memory energy savings for our page migration policies.
Compared to previous studies, we observe average speedups of 2.6% and energy savings of 65.9% for 3D-DRAM + DDR4, average speedups of 10% and energy savings of 76.8% for 3D-DRAM + PCM, and average speedups of 8% and energy savings of 68.5% for 3D-DRAM + DDR4 + PCM.
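The observation that static placement can beat a policy driven only by access counts can be illustrated with a toy model. The following is a minimal sketch and not one of the policies evaluated in the paper: the latencies, the migration cost, the epoch length, and the function names `simulate` and `migrate` are all hypothetical values and names chosen for illustration.

```python
from collections import Counter

# Hypothetical costs in arbitrary units; none of these come from the paper.
FAST_LATENCY = 1      # access to a page in the fast device (e.g. 3D-DRAM)
SLOW_LATENCY = 4      # access to a page in the slow device (e.g. DDR4 or PCM)
MIGRATION_COST = 50   # cost of swapping one fast page with one slow page


def migrate(fast, counts, threshold):
    """Swap hot slow pages with cold fast pages; return the migration cost."""
    # Hottest slow pages first, coldest fast pages first (ties broken by page id).
    hot_slow = sorted((p for p in counts if p not in fast),
                      key=lambda p: (-counts[p], p))
    cold_fast = sorted(fast, key=lambda p: (counts[p], p))
    cost = 0
    for hot, cold in zip(hot_slow, cold_fast):
        if counts[hot] - counts[cold] <= threshold:
            break  # remaining pairs are even less worth swapping
        fast.remove(cold)
        fast.add(hot)
        cost += MIGRATION_COST
    return cost


def simulate(trace, fast_pages, epoch=0, threshold=0):
    """Total cost of a page-access trace; epoch=0 means static placement."""
    fast = set(fast_pages)
    counts = Counter()
    cost = 0
    for i, page in enumerate(trace, start=1):
        cost += FAST_LATENCY if page in fast else SLOW_LATENCY
        counts[page] += 1
        if epoch and i % epoch == 0:
            cost += migrate(fast, counts, threshold)
            counts.clear()  # counts are per epoch
    return cost


# A steady workload: one hot page that starts in slow memory.
steady = [1] * 100
# A phase-changing workload: the hot page flips every epoch.
ping_pong = ([1] * 3 + [0] * 3) * 2

print(simulate(steady, {0}), simulate(steady, {0}, epoch=10))
print(simulate(ping_pong, {0}), simulate(ping_pong, {0}, epoch=3))
```

In this toy setting the count-based policy helps the steady trace, but on the phase-changing trace every migration moves a page just before it goes cold, so the migration cost dominates and static placement is cheaper.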
Similar resources
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale data intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and that the same workload is assigned to each node. Default Hadoop d...
Speeding up the Memory
Scalable Flat Cache Only Memory Architectures (Flat COMA) are designed for reduced memory access latencies while minimizing programmer and operating system involvement. Indeed, to keep memory access latencies low, neither the programmer needs to perform clever data placement nor the operating system needs to perform page migration. The hardware automatically replicates the data and migrates it ...
VM Consolidation by using Selection and Placement of VMs in Cloud Datacenters
The Cloud Computing model leverages virtualization of computing resources allowing customers to provision resources on-demand on a pay-as-you-go basis. During recent years, the power consumption of datacenters in cloud environment attracted researchers. Optimization of energy consumption can be performed by different methods including virtual machine (VM) consolidation. This technique can reduc...
Page Migration with Limited Local Memory Capacity
Page migration problems arise in distributed data management. The goal is to distribute a set of pages in a network of processors, each of which has its local memory, so that a sequence of memory accesses can be executed efficiently. Most previous work assumes that the local memories have infinite capacities, which is unrealistic in practice. In this paper we study the migration problem under the ...
Evaluation of Multiprocessor Memory Systems Using Off-Line Optimal Behavior
In recent years, much effort has been devoted to analyzing the performance of distributed memory systems for multiprocessors. Such systems usually consist of a set of memories or caches, some device such as a bus or switch to connect the memories and processors, and a policy for determining when to put which addressable objects in which memories. In attempting to evaluate such systems, it has ge...